Store flattened field data in binary doc values by jordan-powers · Pull Request #140246 · elastic/elasticsearch

jordan-powers · 2026-01-07T01:03:35Z

This PR updates the FlattenedFieldMapper to use binary doc values instead of sorted set doc values.

elasticsearchmachine · 2026-01-07T01:04:23Z

Hi @jordan-powers, I've created a changelog YAML for you.


server/src/main/java/org/elasticsearch/index/fielddata/MultiValuedSortedBinaryDocValues.java

server/src/test/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldMapperTests.java

elasticsearchmachine · 2026-01-08T15:48:20Z

Hi @jordan-powers, I've updated the changelog YAML for you.

elasticsearchmachine · 2026-01-09T05:12:08Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

martijnvg

Thanks @jordan-powers, this looks good.

Before this gets merged, maybe first open a PR that adds a rolling upgrade test for the flattened field type (similar to TextRollingUpgradeIT)?

server/src/main/java/org/elasticsearch/index/mapper/MultiValuedBinaryDocValuesField.java

server/src/test/java/org/elasticsearch/search/aggregations/support/ValuesSourceConfigTests.java

This PR adds a test to compare the synthetic source produced by flattened fields using the TSDB codec against flattened fields not using that codec. The test generates 32 random documents, indexes them into two indices (one using the codec, the other not using the codec), then retrieves the documents from the two indices and compares them. This is a test for #140246, since once that PR is merged, flattened fields using the TSDB codec will use binary doc values while flattened fields using the default codec will continue to use sorted set doc values.
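
The comparison described above follows a differential-testing pattern: feed the same randomly generated documents through two storage paths and check that what comes back is identical. A minimal, self-contained sketch of that pattern is shown here. It is not the integration test from #140489; the class name, the two `roundTrip*` stand-ins, and the `'\0'` separator are hypothetical and only illustrate the shape of the check, with no Elasticsearch or Lucene APIs involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

/** Illustrative sketch only; every name here is hypothetical, not from the PR. */
public final class DifferentialRoundTripSketch {
    // The PR description mentions 32 random documents.
    static final int DOC_COUNT = 32;

    /** Generates small random key/value documents from a fixed seed. */
    public static List<Map<String, String>> randomDocs(long seed) {
        Random random = new Random(seed);
        List<Map<String, String>> docs = new ArrayList<>();
        for (int i = 0; i < DOC_COUNT; i++) {
            Map<String, String> doc = new TreeMap<>();
            int fields = 1 + random.nextInt(4);
            for (int f = 0; f < fields; f++) {
                doc.put("key" + random.nextInt(8), "value" + random.nextInt(100));
            }
            docs.add(doc);
        }
        return docs;
    }

    /** Stand-in for storage path A: keeps entries in a sorted map. */
    public static Map<String, String> roundTripSorted(Map<String, String> doc) {
        return new TreeMap<>(doc);
    }

    /** Stand-in for storage path B: encodes each entry as one string, then decodes it. */
    public static Map<String, String> roundTripEncoded(Map<String, String> doc) {
        List<String> encoded = new ArrayList<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            encoded.add(e.getKey() + '\0' + e.getValue()); // assumed separator
        }
        Map<String, String> decoded = new TreeMap<>();
        for (String s : encoded) {
            int sep = s.indexOf('\0');
            decoded.put(s.substring(0, sep), s.substring(sep + 1));
        }
        return decoded;
    }

    /** Counts documents whose two round-tripped forms differ. */
    public static int countMismatches(List<Map<String, String>> docs) {
        int mismatches = 0;
        for (Map<String, String> doc : docs) {
            if (!roundTripSorted(doc).equals(roundTripEncoded(doc))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(randomDocs(42L)));
    }
}
```

Using a fixed seed keeps any failure reproducible, which is useful when the two paths are expected to agree on every document.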

jordan-powers · 2026-01-13T19:49:24Z

Opened #140611 to add a rolling upgrade test for flattened fields.

martijnvg

Thanks Jordan! LGTM

martijnvg · 2026-01-15T10:46:50Z

...rc/main/java/org/elasticsearch/index/mapper/flattened/BinaryKeyedFlattenedLeafFieldData.java

+ * This class wraps the field data that is built directly on the keyed flattened field,
+ * and filters out values whose prefix doesn't match the requested key.
+ */
+public class BinaryKeyedFlattenedLeafFieldData implements LeafFieldData {
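
For readers unfamiliar with the keyed representation, the javadoc above describes filtering values by a key prefix. The following is a rough, standalone sketch of that idea in plain Java. It is not the code from this PR: the class and method names are hypothetical, and the `'\0'` separator between key and value is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and the separator are assumptions, not PR code. */
public final class KeyedValueFilterSketch {
    // Assumed separator between the key and the value in a keyed entry.
    private static final char SEPARATOR = '\0';

    /** Builds the keyed form: key, separator, value. */
    public static String keyed(String key, String value) {
        return key + SEPARATOR + value;
    }

    /**
     * Returns the values stored under the requested key, with the key prefix
     * stripped, in their original order. Entries for other keys are skipped.
     */
    public static List<String> valuesForKey(List<String> keyedValues, String key) {
        String prefix = key + SEPARATOR;
        List<String> out = new ArrayList<>();
        for (String keyedValue : keyedValues) {
            if (keyedValue.startsWith(prefix)) {
                out.add(keyedValue.substring(prefix.length()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> doc = List.of(keyed("host", "a"), keyed("hostname", "b"), keyed("host", "c"));
        System.out.println(valuesForKey(doc, "host")); // prints [a, c]
    }
}
```

Including the separator in the prefix is what keeps `hostname` from matching a request for `host`.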


martijnvg · 2026-01-15T10:47:23Z

server/src/main/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldMapper.java

        }
    }

+    public static class BinaryKeyedFlattenedFieldData implements IndexFieldData<LeafFieldData> {


As of #140246, flattened fields might use either binary or sorted set doc values, and the associated synthetic field loader can handle either format. This patch updates the name and javadoc of that field loader to document this behavior.

jordan-powers added 8 commits January 6, 2026 16:59

Directly add fields to doc in FlattenedFieldParser

89c6be5

Add SeparateCount#addToFieldInDoc

ac64f0d

Store flattened fields in binary doc values

a978bd0

Implement flattened fields synthetic source for binary doc values

3de1881

Update FlattenedFieldMapperTests

2164715

Support IndexFieldData for binary flattened fields

86ac0ad

Fix FlattenedFieldMapperTests#testSyntheticEmptyListNoDocValuesLoader

6cfbaaa

Fix FlattendFieldMapperTests

9171ba4

jordan-powers self-assigned this Jan 7, 2026

jordan-powers added >feature :StorageEngine/Mapping The storage related side of mappings labels Jan 7, 2026

elasticsearchmachine added the v9.4.0 label Jan 7, 2026

Update docs/changelog/140246.yaml

aaeaf77

jordan-powers added 8 commits January 6, 2026 17:04

Update BinaryKeyedFlattenedLeafFieldData javadoc

383fa2c

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

a3cb19b

…ary-doc-values-2

Fix ValuesSourceConfigTests#testFlattened

f98a4a5

Fix AggregatorTestCase#testSupportedFieldTypes

d9627a8

Fix FlattenedFieldMapperTests#testIndexTimeFieldData

1d46bec

Fix FlattenedFieldMapperTests#testSortShortcuts

754edfa

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

3b721dd

…ary-doc-values-2

Merge remote-tracking branch 'origin/flattened-field-binary-doc-value…

22d2168

…s-2' into flattened-field-binary-doc-values-2

jordan-powers commented Jan 7, 2026

server/src/main/java/org/elasticsearch/index/fielddata/MultiValuedSortedBinaryDocValues.java (outdated, resolved)

jordan-powers commented Jan 7, 2026

server/src/test/java/org/elasticsearch/index/mapper/flattened/FlattenedFieldMapperTests.java (outdated, resolved)

jordan-powers added >enhancement and removed >feature labels Jan 8, 2026

Update docs/changelog/140246.yaml

6353524

jordan-powers added 2 commits January 8, 2026 08:08

Update DocumenetLeafReader#getDocValuesSkipper to return null

a7bd1d8

Only use binary doc values if TSDB codec is enabled

beee1f1

jordan-powers added 4 commits January 8, 2026 11:12

Update KeywordFieldType to accept usesBinaryDocValues as a parameter

8503e1f

Pass usesBinaryDocValues in to flattened keyed field type

40e92bc

Remove unused FlattenedFieldMapperTests#getDocValuesField

8b5720c

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

c266a09

…ary-doc-values-2

jordan-powers marked this pull request as ready for review January 8, 2026 20:38

jordan-powers requested a review from martijnvg January 8, 2026 20:38

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

7a756ca

…ary-doc-values-2

elasticsearchmachine added the Team:StorageEngine label Jan 9, 2026

martijnvg reviewed Jan 9, 2026

server/src/main/java/org/elasticsearch/index/mapper/MultiValuedBinaryDocValuesField.java (outdated, resolved)

server/src/test/java/org/elasticsearch/search/aggregations/support/ValuesSourceConfigTests.java (resolved)

Rename to addToSeparateCountMultiBinaryFieldInDoc

055e9be

jordan-powers mentioned this pull request Jan 10, 2026

Add FlattenedFieldBinaryVsSortedSetDocValuesSyntheticSourceIT #140489

Merged

jordan-powers added 2 commits January 13, 2026 14:28

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

263cfea

…ary-doc-values-2

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

fb69560

…ary-doc-values-2

jordan-powers requested a review from martijnvg January 14, 2026 21:03

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

f2f518e

…ary-doc-values-2

martijnvg approved these changes Jan 15, 2026

jordan-powers added 2 commits January 15, 2026 07:48

Make some classes final

fcc7019

Merge remote-tracking branch 'upstream/main' into flattened-field-bin…

301071e

…ary-doc-values-2

jordan-powers enabled auto-merge (squash) January 15, 2026 15:51

jordan-powers merged commit 1c23ba4 into elastic:main Jan 15, 2026
36 checks passed

jordan-powers mentioned this pull request Jan 20, 2026

Rename FlattenedDocValuesSyntheticFieldLoader #141026

Merged

spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026

Store flattened field data in binary doc values (elastic#140246)

a1b0e97

This PR updates the FlattenedFieldMapper to use binary doc values instead of sorted set doc values

jordan-powers merged 36 commits into elastic:main from jordan-powers:flattened-field-binary-doc-values-2

3 participants
